Multi-Document Summarization Using Document Set Type Classification

نویسنده

  • Jun-ichi Fukumoto
چکیده

In this paper, we propose a summarization system which automatically classifies type of document set and summarizes a document set with its appropriate summarization mechanism. This system will classify a document set into three types: (a) One topic type, (b) multi-topic type, and (c) others. These types will be identified using information of high frequency nouns and Named Entity. In our multi-document summarization system, unnecessary parts are deleted after summarizing each document and then multi-document summary is generated. In type (a), unnecessary parts are similar part between summarized documents by single document summarization. In type (b), unnecessary parts are unsimilar parts in documents. In type (c), unnecessary parts are identified by scores used for single document summarization.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Multi-Document Summarization via Text Classification

Developed so far, multi-document summarization has reached its bottleneck due to the lack of sufficient training data and diverse categories of documents. Text classification just makes up for these deficiencies. In this paper, we propose a novel summarization system called TCSum, which leverages plentiful text classification data to improve the performance of multi-document summarization. TCSu...

متن کامل

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

متن کامل

Analysis of Multi-Document Viewpoint Summarization Using Multi-Dimensional Genres

An interactive information retrieval system that provides different types of summaries of retrieved documents according to each user’s information needs can be effective for understanding the contents. The purpose of this study is to build a multi-document summarizer to produce summaries according to such viewpoints. As an exploratory stage of investigation, we examined the effectiveness of gen...

متن کامل

Extractive Summarization Using Multi-Task Learning with Document Classification

The need for automatic document summarization that can be used for practical applications is increasing rapidly. In this paper, we propose a general framework for summarization that extracts sentences from a document using externally related information. Our work is aimed at single document summarization using small amounts of reference summaries. In particular, we address document summarizatio...

متن کامل

A Survey For Multi-Document Summarization

Automatic Multi-Document summarization is still hard to realize. Under such circumstances, we believe, it is important to observe how humans are doing the same task, and look around for different strategies. We prepared 100 document sets similar to the ones used in the DUC multi-document summarization task. For each document set, several people prepared the following data and we conducted a sur...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004